Deep Noise Suppression Maximizing Non-Differentiable PESQ Mediated by a Non-Intrusive PESQNet
Authors
Abstract
Speech enhancement employing deep neural networks (DNNs) for denoising is called deep noise suppression (DNS). A DNS trained with mean squared error (MSE) losses cannot guarantee good perceptual quality. Perceptual evaluation of speech quality (PESQ) is a widely used metric for evaluating speech quality. However, the original PESQ algorithm is non-differentiable and therefore cannot be used directly as an optimization criterion for gradient-based learning. In this work, we propose an end-to-end non-intrusive PESQNet DNN to estimate the PESQ scores of the enhanced signal. By providing a reference-free perceptual loss, it serves as a mediator towards the DNS training, allowing the DNS to maximize the PESQ score of the enhanced signal. We illustrate the potential of our proposed PESQNet-mediated training on a strong baseline DNS. As a further novelty, we train the DNS and the PESQNet alternatingly to keep the PESQNet up-to-date and to have it perform well specifically for the DNS under training. Detailed analysis shows that the PESQNet mediation increases the DNS performance by about 0.1 PESQ points on synthetic test data and by 0.03 DNSMOS points on real data, compared to training with the MSE-based loss. Our method outperforms the Interspeech 2021 DNS Challenge baseline by 0.2 points on test data. Furthermore, it improves by 0.4 points over the same baseline trained with an approximated differentiable PESQ loss.
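The alternating scheme described in the abstract can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the "DNS" is a single gain `g`, the black-box metric is a rounded negative MSE standing in for PESQ, the "PESQNet" is a linear model on one reference-free feature (the RMS of the enhanced signal), and the surrogate is refit on outputs sampled around the current gain. All function names are hypothetical.

```python
def rms(x):
    """Root mean square of a signal; a reference-free (non-intrusive) feature."""
    return (sum(v * v for v in x) / len(x)) ** 0.5


def enhance(g, y):
    """Toy 'DNS': scale the noisy signal y by a single gain g."""
    return [g * v for v in y]


def black_box_score(x, s):
    """Stand-in for PESQ: intrusive (needs clean s) and piecewise constant,
    so its gradient with respect to x is zero almost everywhere."""
    mse = sum((xi - si) ** 2 for xi, si in zip(x, s)) / len(x)
    return -round(mse, 2)


def fit_surrogate(feats, scores):
    """Toy 'PESQNet': closed-form linear fit, score ~ w0 + w1 * feature."""
    n = len(feats)
    mf, ms = sum(feats) / n, sum(scores) / n
    var = sum((f - mf) ** 2 for f in feats)
    cov = sum((f - mf) * (sc - ms) for f, sc in zip(feats, scores))
    w1 = cov / var
    return ms - w1 * mf, w1


def alternate(y, s, g=1.0, rounds=20, lr=0.2, delta=0.1):
    """Alternate between updating the surrogate and updating the gain."""
    for _ in range(rounds):
        # Step 1 (gain fixed): refit the surrogate on outputs near the
        # current gain, labeled by the black-box metric.
        gains = [g - delta, g, g + delta]
        feats = [rms(enhance(gi, y)) for gi in gains]
        scores = [black_box_score(enhance(gi, y), s) for gi in gains]
        _, w1 = fit_surrogate(feats, scores)
        # Step 2 (surrogate fixed): gradient ascent on the predicted score.
        # For g > 0, d rms(g * y) / dg = rms(y), so d score / dg = w1 * rms(y).
        g += lr * w1 * rms(y)
    return g


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    noisy = [1.5, -0.5, 0.5, -1.5]  # clean plus noise uncorrelated with it
    g = alternate(noisy, clean)
    print("gain:", g, "score:", black_box_score(enhance(g, noisy), clean))
```

The clean reference is used only to label the surrogate's training data in step 1. In step 2 the gain is updated purely through the reference-free surrogate, which is the mediating role the abstract assigns to the PESQNet.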
Similar resources
Intrusive and non-intrusive watermarking
“Can we watermark without perturbing an image?” We present the salient results of the investigation carried out to find an answer to this question.
Non-Intrusive Deep Tracing of SCI Interconnect Traffic
The Scalable Coherent Interface (SCI) is one of the enabling interconnect technologies for high performance computing on PC Clusters. Trinity College Dublin has designed and is currently prototyping a trace instrument that allows deep traces of SCI interconnect traffic. Such an instrument is essential for a detailed spatial and temporal analysis of parallel executed algorithms on loosely couple...
Non-intrusive liveness detection by face images
A technique evaluating liveness in face image sequences is presented. To ensure the actual presence of a live face in contrast to a photograph (playback attack), is a significant problem in face authentication to the extent that anti-spoofing measures are highly desirable. The purpose of the proposed system is to assist in a biometric authentication framework, by adding liveness awareness in a ...
Noise Suppression with Non-Air-Acoustic Sensors
Nonacoustic sensors such as the general electromagnetic motion sensor (GEMS), the physiological microphone (P-Mic), and the electroglottograph (EGG) offer multimodal approaches to speech processing and speaker and speech recognition. These sensors provide measurements of functions of the glottal excitation and, more generally, of the vocal tract articulator movements that are relatively immune ...
Non-auditory Effects Caused by Environmental Noise Pollution
Noise pollution is one of the prominent environmental problems affecting human health especially in developing countries. The impacts of noise on health should not be underestimated. Exposure to acoustical stimuli impairs not only the function of auditory system but also that of many other systems of human body. Previous investigations revealed that noise exposure could result in sleep disturba...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2022
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2022.3165442